Ruby: generate overlay discard predicates#19719
Conversation
fb89f22 to
576f0d2
Compare
b74138e to
8a33cb4
Compare
There was a problem hiding this comment.
Pull Request Overview
This PR adds support for overlay-based discard predicates in the Ruby Tree-sitter extractor, enabling selective filtering of AST nodes and files when running on overlay databases.
- Introduces
OverlayAnnotationandPragmaAnnotationin the QL generator to emit overlay/pragma directives. - Extends
ql_gento produce new predicates (isOverlay,getRawFile,discardFile,discardableAstNode,discardAstNode). - Updates the code generator to conditionally include overlay predicates and adds static definitions in the Ruby QLL.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| shared/tree-sitter-extractor/src/generator/ql_gen.rs | Added functions to generate overlay discard predicates. |
| shared/tree-sitter-extractor/src/generator/ql.rs | Added TopLevel::Predicate, PragmaAnnotation, OverlayAnnotation, and Expression::Negation. |
| shared/tree-sitter-extractor/src/generator/mod.rs | Renamed add_metadata_relation to overlay_support and wired in overlay predicates. |
| ruby/ql/lib/codeql/ruby/ast/internal/TreeSitter.qll | Added overlay/pragma annotations and new predicates for Ruby and ERB modules. |
Comments suppressed due to low confidence (2)
shared/tree-sitter-extractor/src/generator/ql.rs:271
- Add a space or newline after the closing doc comment for readability, e.g.:
write!(f, "/** {} */ ", qldoc)?;so thatoverlay[...]is clearly separated.
write!(f, "/** {} */", qldoc)?;
shared/tree-sitter-extractor/src/generator/mod.rs:63
- Consider adding unit or integration tests to verify that
overlay_supportcorrectly causes the metadata relation and overlay predicates to be written.
if overlay_support {
jbj
left a comment
There was a problem hiding this comment.
I have only some cosmetic comments
8a33cb4 to
acf0fe3
Compare
576f0d2 to
4029b89
Compare
acf0fe3 to
4875dcb
Compare
|
@aibaars if you assume the generated QL is correct, would you be able to review the generator changes? |
jbj
left a comment
There was a problem hiding this comment.
The QL changes LGTM now, assuming you've tested that it works in practice.
4875dcb to
3d78f35
Compare
aibaars
left a comment
There was a problem hiding this comment.
Looks good to me. Is there any harm in generating these predicates for other tree-sitter based extractors as well? if not , we might just as well include them by default.
|
I think @kaspersv determined that
I don't think there is, but it looks like |
Right. I don't think we normally bother with upgrade scripts for QL-for-QL, but I would prefer not to deal with updating its dbscheme nonetheless, for now. |
Yes, that is correct. In theory they should no longer affect the semantics of overlay analysis and a DCA experiment for Java showed no impact of removing them on overlay analysis accuracy (without Which reminds me, I should update the Incremental CodeQL docs to remove the |
4029b89 to
fa24ee6
Compare
3d78f35 to
c9e7be7
Compare
|
I've removed the
As discussed in the extractor PR, I can do a quick follow-up PR that adds the |
fa24ee6 to
665df4b
Compare
c9e7be7 to
c968a95
Compare
The base branch was changed.
|
|
||
| /** Holds if the database is an overlay. */ | ||
| overlay[local] | ||
| private predicate isOverlay() { databaseMetadata("isOverlay", "true") } |
Check warning
Code scanning / CodeQL
Dead code Warning
|
|
||
| /** Gets the file containing the given `node`. */ | ||
| overlay[local] | ||
| private @file getNodeFile(@ruby_ast_node node) { |
Check warning
Code scanning / CodeQL
Dead code Warning
|
|
||
| /** Holds if `file` was extracted as part of the overlay database. */ | ||
| overlay[local] | ||
| private predicate discardFile(@file file) { isOverlay() and file = getNodeFile(_) } |
Check warning
Code scanning / CodeQL
Dead code Warning
|
|
||
| /** Holds if `node` is in the `file` and is part of the overlay base database. */ | ||
| overlay[local] | ||
| private predicate discardableAstNode(@file file, @ruby_ast_node node) { |
Check warning
Code scanning / CodeQL
Dead code Warning
|
|
||
| /** Holds if `node` should be discarded, because it is part of the overlay base and is in a file that was also extracted as part of the overlay database. */ | ||
| overlay[discard_entity] | ||
| private predicate discardAstNode(@ruby_ast_node node) { |
Check warning
Code scanning / CodeQL
Dead code Warning
|
|
||
| /** Gets the file containing the given `node`. */ | ||
| overlay[local] | ||
| private @file getNodeFile(@erb_ast_node node) { |
Check warning
Code scanning / CodeQL
Dead code Warning
|
|
||
| /** Holds if `file` was extracted as part of the overlay database. */ | ||
| overlay[local] | ||
| private predicate discardFile(@file file) { isOverlay() and file = getNodeFile(_) } |
Check warning
Code scanning / CodeQL
Dead code Warning
|
|
||
| /** Holds if `node` is in the `file` and is part of the overlay base database. */ | ||
| overlay[local] | ||
| private predicate discardableAstNode(@file file, @erb_ast_node node) { |
Check warning
Code scanning / CodeQL
Dead code Warning
|
|
||
| /** Holds if `node` should be discarded, because it is part of the overlay base and is in a file that was also extracted as part of the overlay database. */ | ||
| overlay[discard_entity] | ||
| private predicate discardAstNode(@erb_ast_node node) { |
Check warning
Code scanning / CodeQL
Dead code Warning
e36a625 to
a9ddf00
Compare
Stacked on top of #19684